    Learning Recursive Segments for Discourse Parsing

    Automatically detecting discourse segments is an important preliminary step towards full discourse parsing. Previous research on discourse segmentation has relied on the assumption that elementary discourse units (EDUs) in a document always form a linear sequence (i.e., they can never be nested). Unfortunately, this assumption turns out to be too strong, for some theories of discourse, like SDRT, allow for nested discourse units. In this paper, we present a simple approach to discourse segmentation that is able to produce nested EDUs. Our approach builds on standard multi-class classification techniques combined with a simple repairing heuristic that enforces global coherence. Our system was developed and evaluated on the first round of annotations provided by the French Annodis project (an ongoing effort to create a discourse bank for French). Cross-validated on only 47 documents (1,445 EDUs), our system achieves encouraging results, with an F-score of 73% for finding EDUs. Comment: published at LREC 201

    Predicting globally-coherent temporal structures from texts via endpoint inference and graph decomposition

    An elegant approach to learning temporal orderings from texts is to formulate this problem as a constraint optimization problem, which can then be given an exact solution using Integer Linear Programming. This works well for cases where the number of possible relations between temporal entities is restricted to the mere precedence relation [Bramsen et al., 2006; Chambers and Jurafsky, 2008], but becomes impractical when considering all possible interval relations. This paper proposes two innovations, inspired by work on temporal reasoning, that control this combinatorial blow-up, thereby rendering exact ILP inference viable in the general case. First, we translate our network of constraints from temporal intervals to their endpoints, to handle a drastically smaller set of constraints while preserving the same temporal information. Second, we show that additional efficiency is gained by enforcing coherence on particular subsets of the entire temporal graph. We evaluate these innovations through various experiments on TimeBank 1.2, and compare our ILP formulations with various baselines and oracle systems.

    Comparison of different algebras for inducing the temporal structure of texts

    This paper investigates the impact of using different temporal algebras for learning temporal relations between events. Specifically, we compare three interval-based algebras: Allen's (1983) algebra, Bruce's (1972) algebra, and the algebra derived from the TempEval-07 campaign. These algebras encode different granularities of relations and have different inferential properties. They in turn behave differently when used to enforce global consistency constraints on the building of a temporal representation. Through various experiments on the TimeBank/AQUAINT corpus, we show that although the TempEval relation set leads to the best classification accuracy, it is too vague to be used for enforcing consistency. By contrast, the other two relation sets are both harder to learn, but more useful when global consistency is important. Overall, the Bruce algebra is shown to give the best compromise between learnability and expressive power.

    Special issue - Beyond clickbait and commerce: The ethics, possibilities and challenges of not-for-profit media

    This special issue of Ethical Space explores the ethical dilemmas arising in the turbulent journalistic environment created by digital transformation and its impact on the traditional media business model.

    Constrained decoding for text-level discourse parsing

    This paper presents a novel approach to document-based discourse analysis by performing a global A* search over the space of possible structures while optimizing a global criterion over the set of potential coherence relations. Existing approaches to discourse analysis have so far relied on greedy search strategies or restricted themselves to sentence-level discourse parsing. Another advantage of our approach over other global alternatives (like Maximum Spanning Tree decoding algorithms) is its flexibility in being able to integrate constraints (including linguistically motivated ones like the Right Frontier Constraint). Finally, our paper provides the first discourse parsing system for French; our evaluation is carried out on the Annodis corpus. While using far less training data than previous work on English, our system manages to achieve state-of-the-art results, with F1-scores of 66.2 and 46.8 when compared to unlabeled and labeled reference structures.

    The Nature Of Influenza Virus Virulence/Pathogenicity

    Ecological impacts of small dams on South African rivers Part 1: Drivers of change – water quantity and quality

    Impacts of large dams are well-known and quantifiable, while small dams have generally been perceived as benign, both socially and environmentally. The present study quantifies the cumulative impacts of small dams on the water quality (physico-chemistry and invertebrate biotic indices) and quantity (discharge) of downstream rivers in 2 South African regions. The information from 2 South African national databases was used for evaluating the cumulative impacts on water quality and quantity. Physico-chemistry and biological data were obtained from the River Health Programme, and discharge data at stream flow gauges were obtained from the Hydrological Information System. Multivariate analyses were conducted to establish broad patterns for cumulative impacts of small dams across the 2 regions – Western Cape (winter rainfall, temperate, south-western coast) and Mpumalanga (summer rainfall, tropical, eastern coast). Multivariate analyses found that the changes in macroinvertebrate indices and the stream’s physico-chemistry were more strongly correlated with the density of small dams in the catchment (as a measure of cumulative impact potential) than with the storage capacity of large dams. T-tests on the data, not including samples with upstream large dams, indicated that a high density of small dams significantly reduced low flows and increased certain physico-chemistry variables (particularly total dissolved salts) in both regions, along with associated significant reductions in a macroinvertebrate index (SASS4 average score per taxon). Regional differences were apparent in the results for discharge reductions and the macroinvertebrate index. The results suggest that the cumulative effect of a high number of small dams is impacting the quality and quantity of waters in South African rivers and that these impacts need to be systematically incorporated into the monitoring protocol of the environmental water requirements.
Keywords: cumulative impacts, regional comparison, macroinvertebrate indices, measures of small-dam impact potential, average score per taxo